|Vice - Inside the Intense, Insular World of AOL Disc Collecting|
|Forbes - Do Big Data Results Depend On What Data We Look At?|
|Forbes - History As Big Data: 500 Years Of Book Images And Mapping Millions Of Books|
|The Daily Targum - Founder of Internet Archive Brewster Kahle 'bytes' into dream of making all information public, free on Internet|
|Slate - The Creator of the Internet Archive Should Be the Next Librarian of Congress|
|CBC Radio - Why I spent my summer rescuing thousands of vintage manuals|
|The Atlantic - Introducing the Archive Corps|
|Popular Science - Can we save all the physical media in the world as digital data?|
|Washingtonian - The TBD.com Archives Vanish Again, Maybe for the Last Time|
|The Atlantic - The 2016 Candidates Who Are Making Headlines|
The Internet Archive is not interested in offering access to web sites or other Internet documents whose authors do not want their materials in the collection. To remove your site from the Wayback Machine, place a robots.txt file at the top level of your site (e.g. www.yourdomain.com/robots.txt).
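The robots.txt file will do two things: keep the Internet Archive's crawler from crawling your site, and remove your site's documents from the Wayback Machine.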
To exclude the Internet Archive's crawler (and remove documents from the Wayback Machine) while allowing all other robots to crawl your site, your robots.txt file should say:
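The directive text itself does not appear in this copy of the page. As a minimal sketch, assuming the crawler identifies itself with the user-agent token ia_archiver (an assumption, not stated in the text above), the following Python snippet builds such a file and uses the standard library's robots.txt parser to check that it excludes only that crawler. The bot name SomeOtherBot and the page URL are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Assumption: "ia_archiver" is the user-agent token of the Internet
# Archive's crawler. The surrounding text does not state the token.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical page on the example domain used in the text.
PAGE = "http://www.yourdomain.com/index.html"

archive_allowed = is_allowed(ROBOTS_TXT, "ia_archiver", PAGE)
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", PAGE)

print("archive crawler allowed:", archive_allowed)
print("other robots allowed:", other_allowed)
```

Because the file has no rule for any other user agent, robots that do not match the named token are left free to crawl the site.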
Robots.txt can be used to block access to the whole domain, or to any file or directory within it. Many resources for webmasters and site owners describe this method and how to use it, including http://www.robotstxt.org/.
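To illustrate blocking a single directory or file rather than the whole domain, here is a small sketch that again uses Python's standard robots.txt parser. The user-agent token ia_archiver and the paths /private/ and /drafts/report.html are assumptions chosen for illustration; none of them comes from the text above.

```python
from urllib.robotparser import RobotFileParser

# Assumptions: the "ia_archiver" token and both paths are illustrative only.
PARTIAL_ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /private/
Disallow: /drafts/report.html
"""


def blocked_paths(robots_txt: str, user_agent: str, paths: list[str]) -> list[str]:
    """Return the subset of paths that user_agent may not fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


candidate_paths = [
    "/private/notes.html",   # inside the blocked directory
    "/drafts/report.html",   # the single blocked file
    "/drafts/other.html",    # same directory, but not listed
    "/public/index.html",    # not covered by any rule
]

blocked = blocked_paths(PARTIAL_ROBOTS_TXT, "ia_archiver", candidate_paths)
print(blocked)
```

A Disallow rule matches by path prefix, so a trailing slash covers everything under a directory, while a full file path covers only that file.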